Select the R package you would like to use for the analysis from the “Select R package” pulldown list.
To upload a dataset, use the ‘Browse’ button in the “Choose file to upload” field.
Hint: the dataset must be fully processed and contain the initial clustering information!
Hint2: for RaceID, the dataset must be preprocessed with version 3 of the package; for Monocle, with version 2; for Seurat, with the CRAN-released version 3.
This vignette showcases the use of a dataset from a custom path.
A published dataset stored under “/data/processing/scRNAseq_shiny_app_example_data/GSE81076_raceid.workspaceR/sc.minT1000.RData” will be analyzed. See Grün D, Muraro MJ, Boisset JC, Wiebrands K et al. De Novo Prediction of Stem Cell Identity using Single-Cell Transcriptome Data. Cell Stem Cell 2016 Aug 4;19(2):266-277 for the original publication.
Select RaceID3 as the analysis R package. Upload the dataset (wait until the upload is complete) and click on ‘Select dataset’ (Figure 1).
Figure 1: R package selection and dataset upload
After some lag, the head of the normalized data appears in the “Input Data” tab (Figure 2). You can also check the dimensions of your matrix and the summary of the TPC (transcript per cell) distribution in the corresponding boxes.
Figure 2: Head of normalized counts and data summary
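The dimension and TPC summaries can be pictured with a small sketch. It is written in Python purely for illustration and is not the app's code; the function name `tpc_summary` and the toy gene names are hypothetical, and whether the app summarises raw or normalized counts is not stated here.

```python
import statistics

def tpc_summary(counts):
    """Summarise a genes x cells matrix given as {gene: [count per cell]}."""
    n_genes = len(counts)
    n_cells = len(next(iter(counts.values()))) if counts else 0
    # Column sums: total transcripts detected in each cell.
    tpc = [sum(row[j] for row in counts.values()) for j in range(n_cells)]
    return {
        "dim": (n_genes, n_cells),
        "min": min(tpc),
        "median": statistics.median(tpc),
        "mean": statistics.mean(tpc),
        "max": max(tpc),
    }

# Toy matrix with three genes and three cells (made-up values).
toy = {"GeneA": [5, 0, 2], "GeneB": [1, 3, 0], "GeneC": [0, 0, 4]}
summary = tpc_summary(toy)
print(summary)
```

A very low minimum TPC in such a summary would suggest that the dataset has not been filtered as expected before upload.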
In the “Cell map and clustering” tab, a cluster membership t-SNE plot for the preselected number of clusters is displayed for the loaded dataset (Figure 3). A plot of the within-cluster dispersion as a function of cluster number appears in the “Metrics for cluster number selection” box, alongside a silhouette plot illustrating cluster assignment quality (Figure 4).
Use this information to guide your cluster number choice as described in the package vignette.
The dataset was originally clustered into 6 clusters.
Figure 3: Cluster plots for uploaded dataset
Figure 4: Cluster quality metrics for loaded dataset
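To make the two metrics above concrete, here is a minimal Python sketch of a within-cluster dispersion and of per-cell silhouette widths. It assumes Euclidean distance and a sum-of-squares dispersion; the package may well use a different distance and a different dispersion definition, so treat this as an illustration of the idea rather than the package's computation. All names and toy coordinates are hypothetical.

```python
import math

def within_cluster_dispersion(points, labels):
    """Sum of squared Euclidean distances of each point to its cluster centroid."""
    total = 0.0
    for k in set(labels):
        members = [p for p, lab in zip(points, labels) if lab == k]
        centroid = [sum(coord) / len(members) for coord in zip(*members)]
        total += sum(math.dist(p, centroid) ** 2 for p in members)
    return total

def silhouette_widths(points, labels):
    """Per-point silhouette width s = (b - a) / max(a, b)."""
    clusters = set(labels)
    widths = []
    for i, p in enumerate(points):
        def mean_dist(k):
            d = [math.dist(p, q) for j, q in enumerate(points)
                 if labels[j] == k and j != i]
            return sum(d) / len(d) if d else None
        a = mean_dist(labels[i])  # mean distance within the point's own cluster
        others = [mean_dist(k) for k in clusters if k != labels[i]]
        if a is None or not others:
            widths.append(0.0)  # convention for singletons or a single cluster
            continue
        b = min(others)  # mean distance to the nearest other cluster
        denom = max(a, b)
        widths.append((b - a) / denom if denom > 0 else 0.0)
    return widths

# Two well-separated toy groups of two points each.
toy_points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
disp_two = within_cluster_dispersion(toy_points, [1, 1, 2, 2])
disp_one = within_cluster_dispersion(toy_points, [1, 1, 1, 1])
sil_two = silhouette_widths(toy_points, [1, 1, 2, 2])
print(disp_one, disp_two, sil_two)
```

In this toy case the dispersion drops sharply when going from one to two clusters and the silhouette widths are close to 1, which is the kind of pattern that supports a given cluster number.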
Suppose you decide to change the number of clusters, e.g. to 3. Update the value on the slider and click on ‘Update cluster plots’. This initiates re-clustering (Figure 5); after some waiting time, the updated t-SNE and silhouette plots replace the old ones (Figure 6, Figure 7).
Figure 5: Update cluster number choice
Figure 6: Cluster plots for updated cluster number
Figure 7: Cluster quality metrics for updated cluster number
To obtain marker genes for each cluster (by default, the top 2), click on ‘Get marker genes’ in the ‘Marker Gene Calculation’ page (Figure 8). After a rather long while, a table of top markers and a corresponding heatmap appear (Figure 9).
Figure 8: Request top marker genes
Figure 9: Top marker genes result
To increase the number of markers displayed in the table and on the heatmap, move the slider above the table. Both outputs will be updated (Figure 10).
Figure 10: Update the number of marker genes displayed
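The effect of changing the number of displayed markers can be sketched as a top-n selection per cluster. The Python snippet below is only an illustration: the `score` column is hypothetical (it could stand for a fold change or any other ranking statistic), and how the app actually ranks markers is not described here.

```python
from collections import defaultdict

def top_markers(rows, n=2):
    """Keep the n highest-scoring genes per cluster.

    rows: iterable of (cluster, gene, score) tuples; 'score' is a
    hypothetical ranking statistic, not necessarily what the app uses.
    """
    by_cluster = defaultdict(list)
    for cluster, gene, score in rows:
        by_cluster[cluster].append((score, gene))
    result = {}
    for cluster, scored in by_cluster.items():
        scored.sort(key=lambda sg: sg[0], reverse=True)
        result[cluster] = [gene for _, gene in scored[:n]]
    return result

# Made-up marker table for two clusters.
toy_rows = [
    (1, "GeneA", 3.2), (1, "GeneB", 1.1), (1, "GeneC", 2.5),
    (2, "GeneD", 0.9), (2, "GeneE", 4.0), (2, "GeneF", 1.7),
]
default_top = top_markers(toy_rows)      # two markers per cluster
more_top = top_markers(toy_rows, n=3)    # after increasing the number displayed
print(default_top, more_top)
```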
To download the marker table, use the ‘Download table’ button (Figure 11).
Figure 11: Download cluster marker table
In the “Marker Gene Visualization” tab, you may plot expression of selected genes, as long as they are expressed in at least 1 cell in the dataset. To select a gene, copy one of the top markers into the “GeneID” field in the box and click on ‘Select genes’ (Figure 12).
Figure 12: Select gene IDs for visualization
Check in the ‘Genes used’ field that the selected gene or genes are expressed (Figure 12).
Modify the plot title and expression scale if needed, and click on ‘Plot cell map’ to visualise expression of the selected gene or genes (Figure 13).
Figure 13: t-SNE map with marker gene expression
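The requirement that a gene be expressed in at least one cell can be read as a simple filter on the requested gene IDs. The following Python sketch shows one way such a check could work; the function name `genes_used` and the toy data are hypothetical and do not reflect the app's internals.

```python
def genes_used(requested, counts):
    """Return the requested genes that are present in the matrix and
    have a nonzero count in at least one cell."""
    return [
        g for g in requested
        if g in counts and any(c > 0 for c in counts[g])
    ]

# Toy matrix: GeneA is detected in one cell, GeneB in none,
# and GeneX is absent from the dataset altogether.
toy_counts = {"GeneA": [0, 2, 0], "GeneB": [0, 0, 0]}
used = genes_used(["GeneA", "GeneB", "GeneX"], toy_counts)
print(used)
```

A gene ID missing from the resulting list would either be misspelled, absent from the dataset, or not detected in any cell.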
In the “Correlation Analyses” tab, you may query your dataset for the genes most correlated with your genes of interest and obtain pairwise gene expression plots. Again, enter a gene ID in the side box and click on the “Select genes” button in this tab (Figure 14).
Figure 14: Select gene IDs for correlation analysis
A violin plot of the Pearson correlation calculated on log2-transformed counts will appear, alongside a list of the top 10 genes with the highest absolute correlation to the selected genes (Figure 15).
Figure 15: Display top correlated genes
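As a rough illustration of the ranking described above, the Python sketch below computes the Pearson correlation between a query gene and every other gene on log2-transformed counts and keeps the genes with the highest absolute correlation. The pseudocount of 1 added before the log transform is an assumption, as are all function and gene names; the app's exact transformation is not specified here.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # constant profile: correlation undefined, reported as 0 here
    return cov / (sx * sy)

def top_correlated(query, counts, n=10, pseudocount=1.0):
    """Rank genes by absolute Pearson correlation with the query gene.

    The pseudocount is an assumption made so that zero counts can be logged.
    """
    logged = {g: [math.log2(c + pseudocount) for c in v] for g, v in counts.items()}
    scores = [(g, pearson(logged[query], v)) for g, v in logged.items() if g != query]
    scores.sort(key=lambda gs: abs(gs[1]), reverse=True)
    return scores[:n]

# Made-up counts for five genes in three cells.
toy_counts = {
    "GeneA": [1, 3, 7],
    "GeneB": [3, 7, 15],  # rises with GeneA
    "GeneC": [7, 3, 1],   # falls as GeneA rises
    "GeneD": [1, 1, 1],   # constant
    "GeneE": [1, 7, 3],
}
top2 = top_correlated("GeneA", toy_counts, n=2)
print(top2)
```

Ranking by absolute value means that strongly anti-correlated genes, such as the toy `GeneC`, appear in the list next to positively correlated ones.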
To plot pairwise correlation for selected genes, enter gene IDs into the boxes for the X and Y axes in the bottom half of the page, adjust the plot title if necessary, and click on the “Plot expression” button (Figure 16).
Figure 16: Select gene IDs for pairwise expression plot
A pairwise plot of normalized counts will appear (Figure 17).
Figure 17: Pairwise expression plot
A published dataset stored under “/data/processing/scRNAseq_shiny_app_example_data/GSE81076_monocle.workspaceR/minT5000.mono.set.RData” will be analyzed. See Grün D, Muraro MJ, Boisset JC, Wiebrands K et al. De Novo Prediction of Stem Cell Identity using Single-Cell Transcriptome Data. Cell Stem Cell 2016 Aug 4;19(2):266-277 for the original publication.
Select Monocle2 as the analysis package. Upload the dataset (wait until the upload is complete) and click on ‘Select dataset’ (Figure 18).
Figure 18: R package selection and dataset upload
After some lag, the head of the normalized data appears in the “Input Data” tab (Figure 19). You can also check the dimensions of your matrix and the summary of the TPC (transcript per cell) distribution in the corresponding boxes.
Figure 19: Head of normalized counts and data summary
In the “Cell map and clustering” tab, a cluster membership t-SNE plot for the preselected number of clusters is displayed for the loaded dataset (Figure 20). A plot of delta (distance) versus rho (density) appears in the “Metrics for cluster number selection” box, alongside a silhouette plot illustrating cluster assignment quality (Figure 21).
Use this information to guide your cluster number choice as described in the package vignette.
The dataset was originally clustered into 17 clusters.
Figure 20: Cluster plots for uploaded dataset
Figure 21: Cluster quality metrics for loaded dataset
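A delta-versus-rho plot is the decision plot used in density peak clustering. The Python sketch below computes both quantities under the assumption of a simple cutoff-based local density; the cutoff `d_c`, the function name and the toy coordinates are hypothetical, and the package's own implementation may differ in its details.

```python
import math

def rho_delta(points, d_c):
    """Local density (rho) and distance to the nearest denser point (delta).

    rho_i   = number of other points closer than the cutoff d_c
    delta_i = smallest distance to a point with strictly higher rho;
              for points with no denser neighbour, the largest distance
              to any other point is used instead.
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c) for i in range(n)]
    delta = []
    for i in range(n):
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        delta.append(min(higher) if higher else max(dist[i]))
    return rho, delta

# Two toy groups; points 0 and 3 sit at the core of their group.
toy_points = [(0, 0), (0, 1), (1, 0), (10, 0), (10, 1), (11, 0)]
rho, delta = rho_delta(toy_points, d_c=1.2)
print(list(zip(rho, delta)))
```

Points that combine a high rho with a high delta, here points 0 and 3, stand out in the upper right of the plot and are the candidates for cluster centres, so counting them gives a guide to the cluster number.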
Suppose you decide to change the number of clusters, e.g. to 3. Update the value on the slider and click on ‘Update cluster plots’. This initiates re-clustering (Figure 22); after some waiting time, the updated t-SNE and silhouette plots replace the old ones (Figure 23, Figure 24).
Figure 22: Update cluster number choice
Figure 23: Cluster plots for updated cluster number
Figure 24: Cluster quality metrics for updated cluster number
To obtain marker genes for each cluster (by default, the top 2), click on ‘Get marker genes’ in the ‘Marker Gene Calculation’ page (Figure 25). After a rather long while, a table of top markers and a corresponding heatmap appear (Figure 26). This calculation is done using the CRAN scRNA-seq analysis package ‘Seurat’.
Figure 25: Request top marker genes
Figure 26: Top marker genes result
To increase the number of markers displayed in the table and on the heatmap, move the slider above the table. Both outputs will be updated (Figure 27).
Figure 27: Update the number of marker genes displayed
To download the marker table, use the ‘Download table’ button (Figure 28).
Figure 28: Download cluster marker table
In the “Marker Gene Visualization” tab, you may plot expression of selected genes, as long as they are expressed in at least 1 cell in the dataset. To select a gene, copy one of the top markers into the “GeneID” field in the box and click on ‘Select genes’ (Figure 29).
Figure 29: Select gene IDs for visualization
Check in the ‘Genes used’ field that the selected gene or genes are expressed (Figure 30).
Modify the plot title and expression scale if needed, and click on ‘Plot cell map’ to visualise expression of the selected gene or genes (Figure 30).
Figure 30: t-SNE map with marker gene expression
In the “Correlation Analyses” tab, you may query your dataset for the genes most correlated with your genes of interest and obtain pairwise gene expression plots. Again, enter a gene ID in the side box and click on the “Select genes” button in this tab (Figure 31).
Figure 31: Select gene IDs for correlation analysis
A violin plot of the Pearson correlation calculated on log2-transformed counts will appear, alongside a list of the top 10 genes with the highest absolute correlation to the selected genes (Figure 32).
Figure 32: Display top correlated genes
To plot pairwise correlation for selected genes, enter gene IDs into the boxes for the X and Y axes in the bottom half of the page, adjust the plot title if necessary, and click on the “Plot expression” button (Figure 33).
Figure 33: Select gene IDs for pairwise expression plot
A pairwise plot of normalized counts will appear (Figure 34).
Figure 34: Pairwise expression plot
A published dataset stored under “/data/processing/scRNAseq_shiny_app_example_data/GSE75478_seuset.umap.RDS” will be analyzed. See Velten L, Haas SF, Raffel S, Blaszkiewicz S et al. Human haematopoietic stem cell lineage commitment is a continuous process. Nat Cell Biol 2017 Apr;19(4):271-281 for the original publication.
Select Seurat3 as the analysis package. Upload the dataset (wait until the upload is complete) and click on ‘Select dataset’ (Figure 35).
Figure 35: R package selection and dataset upload
After some lag, the head of the normalized data appears in the “Input Data” tab (Figure 36). You can also check the dimensions of your matrix and the summary of the TPC (transcript per cell) distribution in the corresponding boxes.
Figure 36: Head of normalized counts and data summary
In the “Cell map and clustering” tab, a cluster membership t-SNE plot for the preselected number of clusters is displayed for the loaded dataset (Figure 37). A clustree plot will appear in the “Metrics for cluster number selection” box, as long as at least two cluster assignment columns are available in the data. Otherwise, the box will remain blank (Figure 38). A silhouette plot illustrating cluster assignment quality is displayed alongside it.
Figure 37: Cluster plots for uploaded dataset
Figure 38: Cluster quality metrics for loaded dataset
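The condition on the clustree plot can be thought of as a check on the metadata column names. The Python sketch below assumes that cluster assignments at different resolutions are stored in columns named with the prefix `RNA_snn_res.` followed by the resolution value; this prefix and the helper names are assumptions made for illustration, and the app may detect the columns differently.

```python
def resolution_columns(columns, prefix="RNA_snn_res."):
    """Return cluster assignment columns sorted by their resolution value.

    The prefix is an assumed naming scheme, not taken from the app.
    """
    found = []
    for name in columns:
        if name.startswith(prefix):
            try:
                found.append((float(name[len(prefix):]), name))
            except ValueError:
                continue  # suffix is not a number: not a resolution column
    return [name for _, name in sorted(found)]

def can_draw_clustree(columns, prefix="RNA_snn_res."):
    """A clustering tree needs assignments at two or more resolutions."""
    return len(resolution_columns(columns, prefix)) >= 2

# Hypothetical metadata column names.
meta_cols = ["nCount_RNA", "nFeature_RNA", "RNA_snn_res.0.8",
             "RNA_snn_res.0.2", "seurat_clusters"]
print(resolution_columns(meta_cols), can_draw_clustree(meta_cols))
```

With a single resolution column there is nothing to connect in the tree, which matches the blank box described above.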
Use this information to guide your cluster number choice as described in the package vignette.
The dataset was originally clustered into 8 clusters.
Suppose you decide to change the resolution, e.g. to 0.2. Update the value on the slider and click on ‘Update cluster plots’. This initiates re-clustering (Figure 39); after some waiting time, the updated t-SNE, clustree and silhouette plots replace the old ones (Figure 40, Figure 41).
Figure 39: Update cluster number choice
Figure 40: Cluster plots for updated cluster number
Figure 41: Cluster quality metrics for updated cluster number
To obtain marker genes for each cluster (by default, the top 2), click on ‘Get marker genes’ in the ‘Marker Gene Calculation’ page (Figure 42). After a rather long while, a table of top markers and a corresponding heatmap appear (Figure 43).
Figure 42: Request top marker genes
Figure 43: Top marker genes result
To increase the number of markers displayed in the table and on the heatmap, move the slider above the table. Both outputs will be updated (Figure 44).
Figure 44: Update the number of marker genes displayed
To download the marker table, use the ‘Download table’ button (Figure 45).
Figure 45: Download cluster marker table
In the “Marker Gene Visualization” tab, you may plot expression of selected genes, as long as they are expressed in at least 1 cell in the dataset. To select a gene, copy one of the top markers into the “GeneID” field in the box and click on ‘Select genes’ (Figure 46).
Figure 46: Select gene IDs for visualization
Check in the ‘Genes used’ field that the selected gene or genes are expressed (Figure 46).
Modify the plot title and expression scale if needed, and click on ‘Plot tsne map’ to visualise expression of the selected gene or genes (Figure 47).
Figure 47: t-SNE map with marker gene expression
In the “Correlation Analyses” tab, you may query your dataset for the genes most correlated with your genes of interest and obtain pairwise gene expression plots. Again, enter a gene ID in the side box and click on the “Select genes” button in this tab (Figure 48).
Figure 48: Select gene IDs for correlation analysis
A violin plot of the Pearson correlation calculated on log2-transformed counts will appear, alongside a list of the top 10 genes with the highest absolute correlation to the selected genes (Figure 49).
Figure 49: Display top correlated genes
To plot pairwise correlation for selected genes, enter gene IDs into the boxes for the X and Y axes in the bottom half of the page, adjust the plot title if necessary, and click on the “Plot expression” button (Figure 50).
Figure 50: Select gene IDs for pairwise expression plot
A pairwise plot of normalized counts will appear (Figure 51).
Figure 51: Pairwise expression plot
To keep track of the parameters you used to generate your plots, it is recommended that you encode them either in the plot titles (customizable by the user) or in the file names under which you save your plots.
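One possible way to follow this recommendation is to build file names from the parameters themselves. The Python helper below is only a suggestion; the function name, the parameter keys and the naming scheme are all hypothetical.

```python
import re

def plot_filename(prefix, params, ext="png"):
    """Build a file name that records the parameters behind a plot.

    Keys are sorted so that the same parameters always give the same name.
    """
    parts = [f"{key}-{value}" for key, value in sorted(params.items())]
    name = "_".join([prefix] + parts)
    # Replace anything that is awkward in file names with a dash.
    name = re.sub(r"[^A-Za-z0-9._-]+", "-", name)
    return f"{name}.{ext}"

# Hypothetical parameters for a cell map coloured by one gene.
example = plot_filename("tsne", {"package": "RaceID3", "clusters": 3, "gene": "GeneA"})
print(example)
```

Sorting the keys keeps the names stable, so plots produced with identical settings in different sessions end up with identical names.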
To keep track of the R and R package versions, inspect the ‘sessionInfo’ tab, which contains the output of the sessionInfo() R command (Figure 52). At the bottom of the page, two buttons are available (Figure 53). Click on ‘Download session info’ or ‘Download your data’ to save the respective file on your computer.
Figure 52: Session Info tab
Figure 53: Download documentation and modified dataset
Lastly, the code behind the app can be retrieved from “https://github.com/maxplanck-ie/scRNAseq_shiny_app” for the given version of the app. The app version is displayed at the bottom of the sidebar (Figure 54).
Figure 54: App version